Xi Wang

Cross-Modal Clinical Knowledge Integration for Mammography Report Generation

May 29, 2026
Via arXiv

The 2nd EReL@MIR Workshop on Efficient Representation Learning for Multimodal Information Retrieval

May 26, 2026
Via arXiv

CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models

May 11, 2026
Via arXiv

RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation

Apr 27, 2026
Via arXiv

Bringing a Personal Point of View: Evaluating Dynamic 3D Gaussian Splatting for Egocentric Scene Reconstruction

Apr 26, 2026
Via arXiv

TTS-PRISM: A Perceptual Reasoning and Interpretable Speech Model for Fine-Grained Diagnosis

Apr 24, 2026
Via arXiv

Test-Time Perturbation Learning with Delayed Feedback for Vision-Language-Action Models

Apr 20, 2026
Via arXiv

Learning from Contrasts: Synthesizing Reasoning Paths from Diverse Search Trajectories

Apr 13, 2026
Via arXiv

A Tale of Two Temperatures: Simple, Efficient, and Diverse Sampling from Diffusion Language Models

Apr 10, 2026
Via arXiv

The Myth of Expert Specialization in MoEs: Why Routing Reflects Geometry, Not Necessarily Domain Expertise

Apr 10, 2026
Via arXiv